




A Formula for the Gini Coefficient 

Author(s): Robert Dorfman 

Source: The Review of Economics and Statistics, Vol. 61, No. 1 (Feb., 1979), pp. 146-149 
Published by: The MIT Press 

Stable URL: http://www.jstor.org/stable/1924845









Conclusion 


It would be naive to claim that the above estimates 
are accurate; clearly, they are based on scattered and 
incomplete data. But the fact that income data, unem- 
ployment data, as well as the estimates of the severity 
of recessions, all point in the same direction does 
suggest that output fluctuations were more severe 
prior to 1930 than in the postwar period, though the 
difference appears to be moderate. 

But while these data thus support Modigliani they do 
not support his conclusion that discretionary stabiliza- 
tion policies have been successful. This is so because, 
as table 2 shows, the money growth rate (measured
only by M2, reliable M1 data not being available) was
more variable prior to the Great Depression than in the
postwar period. Hence, monetarists and other supporters
of a stable growth rate rule would expect that
unemployment was higher prior to the Great Depression.


TABLE 2.—VARIATIONS IN THE MONETARY GROWTH RATE

                                                  Change Over
                 Year to Year Change             2 Year Period
               Standard     Coefficient      Standard     Coefficient
               Deviation    of Variation     Deviation    of Variation

1890-1899         6.3           1.1            10.8           1.1
1900-1909         4.1           0.5             7.0           0.4
1910-1916         5.0           0.7             6.2           0.4
1923-1929         3.1           0.6             4.4           0.4
1950-1959(a)      1.1           0.3             2.9           0.5
1960-1969         2.2           0.4             4.2           0.3
1970-1975         2.6           0.3             4.4           0.2

Sources: Bureau of the Census (1976, p. 992), Board of Governors (1976a, pp.
18-19; 1976b, p. 49).
(a) Excludes 1951-1952.


Furthermore, even if the variance of the money 
growth rate had been the same in both periods, the 
higher unemployment rate prior to the Great Depression
would not suffice to build a case for discretionary
monetary policy. In the pre-Depression period sharp 
declines in the money growth rate were associated 
with bank failures during a recession, and hence were 
very badly timed with regard to stabilization. With 
discretionary monetary policy, declines in the money 
growth rate can be better timed than that, and still 
yield worse results than a constant growth rate rule 
would.

A FORMULA FOR THE GINI COEFFICIENT


Robert Dorfman* 


The Gini Coefficient is well established as a conven- 
tional, ad hoc measure of income inequality. Recently 
there has been a flurry of interest in it, stirred up by a 
debate about its significance as a measure of economic 
welfare (Atkinson, 1970; Dasgupta et al., 1973; Newbery,
1970; Rothschild and Stiglitz, 1973; Sen, 1973;
Sheshinski, 1972) in the course of which a confusing
variety of formulas for the coefficient have been pub- 
lished, some of them quite complicated (Atkinson, 
1970; Fei, 1978; Sen, 1973; Theil, 1967, for example). 
This note will propose a simple formula for the Gini 
Coefficient that will apply to both discrete and con- 
tinuous distributions of income and will be well- 
defined and valid whether or not there is a finite upper 
limit to the income that can be received by anyone, 
provided the mean of the distribution is finite.


Gastwirth (1972) has published a similar formula with- 
out proof, attributing it to Kendall and Stuart (1977), 
where the proof also is omitted. 

The formula to be proposed is 


\[
G = 1 - \frac{1}{\mu} \int_0^{y^*} (1 - F(y))^2 \, dy \tag{1}
\]
where
$F(y)$ is the cumulative probability distribution of income,
$\mu$ is its mean, assumed finite, and
$y^*$ is its upper limit, which may be infinite.
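
As an illustration only, formula (1) can be evaluated directly for a finite list of incomes, whose distribution F(y) is a step function. The following Python sketch is not part of the original note: the function name `gini_formula_1` is hypothetical, and it assumes non-negative incomes with a positive mean.

```python
def gini_formula_1(incomes):
    """Gini coefficient of a finite sample via G = 1 - (1/mu) * integral of (1 - F(y))^2 dy."""
    ys = sorted(incomes)
    n = len(ys)
    mu = sum(ys) / n
    integral = 0.0
    prev = 0.0  # incomes are assumed non-negative, so the integral starts at 0
    for j, y in enumerate(ys):
        # On [prev, y) exactly j observations lie at or below any point, so F = j / n there.
        integral += (1 - j / n) ** 2 * (y - prev)
        prev = y
    # Beyond the largest income F = 1, so nothing more is added to the integral.
    return 1 - integral / mu
```

For the two incomes 1 and 3 the integral is 1.5 and the mean is 2, so the sketch returns 0.25.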


The Gini Coefficient can be approached from either 
of two directions. First, it can be regarded as the 
salient summary statistic of the Lorenz Curve of the 
income distribution. The Lorenz Curve, to be denoted 
L(u), is the proportion of the total income of the econ- 
omy that is received by the lowest 100u% of income 
receivers. (A more formal definition will be given be- 
low.) From this point of view, the Gini Coefficient is 
the area between a given Lorenz Curve and the Lorenz 
Curve for an economy in which everyone receives the 
same income, expressed as a proportion of the area 
under the curve for the equal distribution of income. 
This definition leads to the formula 


\[ G = 1 - 2 \int_0^1 L(u) \, du, \tag{2} \]


as will be demonstrated below. A formula for the area 
under a Lorenz Curve will be derived en route. It is 


\[
\int_0^1 L(u) \, du = \frac{1}{2\mu} \int_0^{y^*} (1 - F(y))^2 \, dy. \tag{3}
\]
Equation (1) follows immediately from equations (2)
and (3).
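
Equation (3) can be checked numerically on a small sample. The sketch below is illustrative and not from the original note: the function names are hypothetical, incomes are assumed non-negative with a positive total, and the Lorenz Curve of a finite sample is taken to be piecewise linear, so the trapezoid rule gives its area exactly.

```python
def lorenz_area(incomes):
    """Area under the piecewise-linear Lorenz Curve of a finite sample."""
    ys = sorted(incomes)
    n = len(ys)
    total = sum(ys)
    area = 0.0
    cum = 0.0  # income accumulated by the people already counted
    for y in ys:
        # Each person covers a width 1/n on the u-axis; L rises from cum/total to (cum + y)/total.
        area += (cum + (cum + y)) / (2 * total) / n
        cum += y
    return area


def rhs_equation_3(incomes):
    """(1 / (2 * mu)) * integral of (1 - F(y))^2 dy for the same sample."""
    ys = sorted(incomes)
    n = len(ys)
    mu = sum(ys) / n
    integral, prev = 0.0, 0.0
    for j, y in enumerate(ys):
        # F equals j / n on the interval [prev, y).
        integral += (1 - j / n) ** 2 * (y - prev)
        prev = y
    return integral / (2 * mu)
```

For the incomes 1 and 3 both functions give 0.375, and equation (2) then gives G = 1 - 2(0.375) = 0.25.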

Second, Gini himself proposed the coefficient that 
now bears his name as a measure of the variability of 
any statistical distribution or probability distribution. 
(See Gini (1912), for example.) Specifically, he based 
his coefficient on the average of the absolute differ- 
ences between pairs of observations, and defined it to 
be the ratio of half of that average to the mean of the 
distribution. We shall see that this definition also leads
to equation (1).
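
Gini's own definition translates almost word for word into code. The sketch below is illustrative only: the name `gini_mean_difference` is hypothetical, and it assumes that the average runs over all ordered pairs of observations, including each observation paired with itself, which corresponds to two independent draws from the empirical distribution.

```python
def gini_mean_difference(incomes):
    """Half the mean absolute difference between pairs, divided by the mean."""
    n = len(incomes)
    mu = sum(incomes) / n
    # Average |x - y| over all n * n ordered pairs (two independent draws).
    total = sum(abs(x - y) for x in incomes for y in incomes)
    mean_abs_diff = total / (n * n)
    return mean_abs_diff / (2 * mu)
```

For the incomes 1 and 3 the four ordered pairs have absolute differences 0, 2, 2, 0, so the average is 1 and the coefficient is 1/(2 x 2) = 0.25.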


Derivation from the Lorenz Curve 


In this section we derive the formula for the Gini 
Coefficient from its definition in terms of the Lorenz 
Curve. But first we must formulate the Lorenz Curve 
in more detail. 

Let F(y) denote the proportion of the population
that receives incomes no greater than y. F(y) need not
be continuous but since it is monotonic it can have no
more than a countable infinity of points of discontinuity.
Furthermore, it is everywhere one-sided continuous
both to the left and to the right, and by virtue
of its definition F(y+) = F(y). We assume throughout
that the points of discontinuity, if any, are isolated,
that F(y) is differentiable between those points, and
that the income distribution has a finite mean, $\mu$.

The Lorenz Curve, L(u), is the function of u, for
$0 \le u \le 1$, that specifies the proportion of aggregate
income that goes to the members of the population in the
lowest 100u% of the income distribution. To relate
L(u) to the distribution of income, regard y as the
function of u, say y(u), that specifies the largest income
such that the proportion of the population whose
incomes do not exceed any lower income does not
exceed u. Then y(u) is defined by
$F(y(u)-) \le u \le F(y(u))$.

We shall write 


\[
I(y) = \int_0^y x \, dF(x),
\]

a Stieltjes integral, for the total income accruing to
members of the population whose incomes do not exceed
y divided by the total number of members of the
population. We shall also use the convention

\[
\int_0^{z-} f(x) \, dg(x) = \lim_{\epsilon \to 0^+} \int_0^{z-\epsilon} f(x) \, dg(x).
\]

In this notation the total income accruing to the lowest 
100u% of the population is 


\[ I(y(u)-) + y(u) \, [u - F(y(u)-)] \]

multiplied by the size of the population. Since the
aggregate income of the population is $\mu$ times the size
of the population, the proportion of income received
by the lowest 100u% is

\[ L(u) = \frac{1}{\mu} \left[ I(y(u)-) + y(u) \, (u - F(y(u)-)) \right]. \]
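
For a discrete distribution the expression for L(u) can be evaluated by walking up the income levels until the one containing u is reached. The Python sketch below is an illustration under stated assumptions, not the author's procedure: the function name `lorenz` and its arguments are hypothetical, `probs[i]` is the population share receiving `incomes[i]`, and the shares are assumed to sum to 1.

```python
def lorenz(u, incomes, probs):
    """L(u) for a discrete distribution, following L(u) = [I(y(u)-) + y(u)(u - F(y(u)-))] / mu."""
    pairs = sorted(zip(incomes, probs))
    mu = sum(y * p for y, p in pairs)
    cum_pop = 0.0     # F(y-): population share with income strictly below the current level y
    cum_income = 0.0  # I(y-): their income, per head of the whole population
    for y, p in pairs:
        if u <= cum_pop + p:
            # Here y(u) = y: everyone below y is included, plus a share u - F(y-) of those at y.
            return (cum_income + y * (u - cum_pop)) / mu
        cum_pop += p
        cum_income += y * p
    return 1.0  # reached only when u equals 1 and the shares fall short of 1 by rounding
```

With incomes 1 and 3 held by half the population each, the poorer half receives 0.5/2 = 25% of total income, so `lorenz(0.5, [1, 3], [0.5, 0.5])` is 0.25.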


The definition of the Gini Coefficient depends on the 
area, A, under the Lorenz curve, or 


\[ A = \int_0^1 L(u) \, du. \]
To evaluate this integral divide the range of u into
segments at the values of u corresponding to the (isolated)
discontinuities of F(y). Suppose these discontinuities
occur at $y_0 = 0, y_1, y_2, \ldots, y_k$ (k may be
infinite). $y(u) = y_i$ when $F(y_i-) \le u \le F(y_i)$, which
we shall abbreviate to $F_i^- \le u \le F_i$. Then

\[
A = \sum_{i=1}^{k} \int_{F_{i-1}}^{F_i} L(u) \, du + \int_{F_k}^{1} L(u) \, du, \tag{4}
\]

where k (possibly infinite) is the index of the largest
discontinuity of F(y).
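The contribution of the $i$th segment is

\[
\frac{1}{\mu} \int_{F_{i-1}}^{F_i} \left[ I(y(u)-) + y(u) \, (u - F(y(u)-)) \right] du.
\]

For $F_{i-1} < u \le F_i^-$, the second term is zero and F(y(u))
= u. For $F_i^- \le u \le F_i$, the first term is constant
($= I(y_i-)$) and $y(u) = y_i$. Thus

\[
\int_{F_{i-1}}^{F_i} L(u) \, du = \frac{1}{\mu} \left[ \int_{y_{i-1}}^{y_i-} I(y) \, dF(y) + (F_i - F_i^-) \, I(y_i-) + \tfrac{1}{2} \, y_i \, (F_i - F_i^-)^2 \right].
\]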
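To simplify, notice

\[ I(y_i-) = I(y_i) - y_i \, (F_i - F_i^-) \]

and

\[
\begin{aligned}
\int_{y_{i-1}}^{y_i-} I(y) \, dF(y) &= -\int_{y_{i-1}}^{y_i-} I(y) \, d(1 - F(y)) \\
&= -(1 - F_i^-) \, I(y_i-) + (1 - F_{i-1}) \, I(y_{i-1}) - \int_{y_{i-1}}^{y_i-} (1 - F(y)) \, y \, d(1 - F(y)).
\end{aligned}
\]

Further, the last integral equals

\[
\tfrac{1}{2} \int_{y_{i-1}}^{y_i-} y \, d(1 - F(y))^2 = \tfrac{1}{2} \int_{y_{i-1}}^{y_i} y \, d(1 - F(y))^2 - \tfrac{1}{2} \, y_i \left[ (1 - F_i)^2 - (1 - F_i^-)^2 \right].
\]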


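Putting these all together, the terms involving $y_i$ cancel
out and there remains

\[
\int_{F_{i-1}}^{F_i} L(u) \, du = \frac{1}{\mu} \left[ -(1 - F_i) \, I(y_i) + (1 - F_{i-1}) \, I(y_{i-1}) - \tfrac{1}{2} \int_{y_{i-1}}^{y_i} y \, d(1 - F(y))^2 \right].
\]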
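By similar manipulations,

\[
\int_{F_k}^{1} L(u) \, du = \frac{1}{\mu} \left[ (1 - F_k) \, I(y_k) - (1 - F(y^*)) \, I(y^*) - \tfrac{1}{2} \int_{y_k}^{y^*} y \, d(1 - F(y))^2 \right],
\]

and, adding up,

\[
A = \frac{1}{\mu} \left[ -(1 - F(y^*)) \, I(y^*) - \tfrac{1}{2} \int_0^{y^*} y \, d(1 - F(y))^2 \right].
\]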
Finally, integrating by parts,

\[
A = \frac{1}{\mu} \left[ -(1 - F(y^*)) \, I(y^*) - \tfrac{1}{2} (1 - F(y^*))^2 \, y^* + \tfrac{1}{2} \int_0^{y^*} (1 - F(y))^2 \, dy \right] \tag{5}
\]


where y*, which may be infinite, denotes the upper 
limit of the income distribution. 

If y* is finite, F(y*) = 1 and the integral is all that
remains.¹ Otherwise we must assure ourselves that the
expression on the right converges as $y^* \to \infty$.

¹ Professor Andrew Gleason assisted greatly with the following
argument.

By straightforward algebra, for any finite y*:

\[
\int_0^{y^*} (1 - F(y))^2 \, dy - (1 - F(y^*))^2 \, y^* = \int_0^{y^*} (F(y^*) - F(y))^2 \, dy + 2 (1 - F(y^*)) \int_0^{y^*} (F(y^*) - F(y)) \, dy.
\]
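
The identity can be checked by splitting $1 - F(y)$ at $F(y^*)$. The expansion below is a reader's sketch of the "straightforward algebra", not a step printed in the original note:

\[
(1 - F(y))^2 = \left[ (1 - F(y^*)) + (F(y^*) - F(y)) \right]^2 = (1 - F(y^*))^2 + 2 (1 - F(y^*)) (F(y^*) - F(y)) + (F(y^*) - F(y))^2 .
\]

Integrating each term over $0 \le y \le y^*$ turns the first into $(1 - F(y^*))^2 \, y^*$, which is the quantity subtracted on the left-hand side.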


Similarly, we can see that

\[ \int_0^{y^*} (F(y^*) - F(y)) \, dy = I(y^*). \tag{6} \]


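Thus the right hand side of equation (5) reduces to

\[
\frac{1}{2\mu} \int_0^{y^*} (F(y^*) - F(y))^2 \, dy \le \frac{1}{2\mu} \int_0^{y^*} (F(y^*) - F(y)) \, dy = \frac{I(y^*)}{2\mu} \le \frac{1}{2}
\]

since $I(y^*) \le \mu$; the first inequality holds because
$0 \le F(y^*) - F(y) \le 1$. Convergence is therefore assured. If
there is no upper bound to the distribution of income,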


\[
A = \lim_{y^* \to \infty} \frac{1}{2\mu} \int_0^{y^*} (F(y^*) - F(y))^2 \, dy = \frac{1}{2\mu} \int_0^{\infty} (1 - F(y))^2 \, dy, \tag{7}
\]


which is valid in any case. 
In the important special case in which F(y) is a
step-function with jumps at $y_0 = 0, y_1, y_2, \ldots$, $F(y) = F_{i-1}$
for $y_{i-1} \le y < y_i$. Then equation (7) reduces to

\[
\int_0^1 L(u) \, du = A = \frac{1}{2\mu} \sum_{i=1}^{k} (1 - F_{i-1})^2 (y_i - y_{i-1}). \tag{8}
\]
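
For a step-function distribution the area under the Lorenz Curve becomes a finite sum over the income levels. The following Python sketch shows one way to compute it; the name `lorenz_area_step` and its arguments are hypothetical, and it assumes non-negative incomes, with `probs[i]` the population share receiving `incomes[i]` and the shares summing to 1.

```python
def lorenz_area_step(incomes, probs):
    """A = (1 / (2 * mu)) * sum of (1 - F_{i-1})^2 * (y_i - y_{i-1}) for a step-function F."""
    pairs = sorted(zip(incomes, probs))
    mu = sum(y * p for y, p in pairs)
    total = 0.0
    prev_y = 0.0  # y_0 = 0
    prev_F = 0.0  # F_0 = 0: no one has been counted before the first level
    for y, p in pairs:
        total += (1 - prev_F) ** 2 * (y - prev_y)
        prev_y = y
        prev_F += p  # F rises by the share located at this income level
    return total / (2 * mu)
```

When everyone receives the same income the sum has a single term equal to that income, so the area is 1/2 whatever the income level.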


We can now derive the Gini Coefficient itself. It is
defined as the difference between A and the area under
the Lorenz Curve for a population in which everyone
receives the same income (namely $\mu$), to be denoted
$A_e$, expressed as a proportion of $A_e$. That is,

\[ G = \frac{A_e - A}{A_e}. \]

For the curve of equal distribution, F(y) = 0 for $y < \mu$
and F(y) = 1 for $y \ge \mu$. By either equation (7) or (8),

\[ A_e = \frac{1}{2\mu} \cdot \mu = \frac{1}{2} \]

and

\[ G = 1 - 2A = 1 - \frac{1}{\mu} \int_0^{y^*} (1 - F(y))^2 \, dy, \]

which is equation (1).


Example 1 


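For the Pareto Distribution, F(y) = 0, $y \le 1$; $F(y) = 1 - y^{-\alpha}$,
$y \ge 1$; $\mu = \alpha/(\alpha - 1)$; $\alpha > 1$. Formula (1)
yields

\[
G = 1 - \frac{\alpha - 1}{\alpha} \left[ \int_0^1 dy + \int_1^{\infty} y^{-2\alpha} \, dy \right] = \frac{1}{2\alpha - 1}.
\]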
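
The Pareto case can also be checked numerically. The sketch below is illustrative only: the function name `pareto_gini_numeric` and the `steps` parameter are hypothetical, and the substitution t = 1/y and the midpoint rule are choices made here for convenience, not steps from the note. The result can be compared with the value $1/(2\alpha - 1)$ that follows from carrying out the two integrals.

```python
def pareto_gini_numeric(alpha, steps=100_000):
    """Evaluate G = 1 - (1/mu) * integral of (1 - F(y))^2 dy for the Pareto distribution."""
    mu = alpha / (alpha - 1)
    # On [0, 1], F = 0, so that part of the integral equals 1.
    # For y >= 1, (1 - F(y))^2 = y ** (-2 * alpha). Substituting t = 1 / y gives
    #   integral from 1 to infinity of y^(-2a) dy = integral from 0 to 1 of t^(2a - 2) dt,
    # which is approximated here by the midpoint rule.
    h = 1.0 / steps
    tail = sum(((k + 0.5) * h) ** (2 * alpha - 2) for k in range(steps)) * h
    return 1 - (1 + tail) / mu
```

With `alpha = 2` the mean is 2 and the tail integral is 1/3, so the result is close to 1 - (4/3)/2 = 1/3.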








Example 2 


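Consider a geometric distribution with i = 0, 1, 2,
...; $k = \infty$, $y_i = i$; and where the proportion of the
population that receives i is $f_0 = 0$, $f_i = ((1 - a)/a) \, a^i$,
i = 1, 2, ...; 0 < a < 1. Then $\mu = 1/(1 - a)$, $F_i = 1 - a^i$.
Equation (8) gives

\[
\int_0^1 L(u) \, du = \frac{1 - a}{2} \sum_{i=0}^{\infty} a^{2i} = \frac{1 - a}{2} \cdot \frac{1}{1 - a^2} = \frac{1}{2(1 + a)}.
\]

Then, by equation (2),

\[ G = 1 - \frac{1}{1 + a} = \frac{a}{1 + a}. \]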
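
A truncated sum gives a quick numerical check of this example. The sketch is illustrative: the function name `geometric_gini` and the truncation length `terms` are assumptions made here, and the infinite series is simply cut off once its terms are negligible.

```python
def geometric_gini(a, terms=2000):
    """Gini coefficient for y_i = i, F_i = 1 - a**i, from the step-function formula and G = 1 - 2A."""
    mu = 1 / (1 - a)
    total = 0.0
    term = 1.0  # (1 - F_0)^2 * (y_1 - y_0) = 1
    for _ in range(terms):
        total += term
        term *= a * a  # (1 - F_i)^2 = a**(2 * i), and every step y_i - y_{i-1} equals 1
    area = total / (2 * mu)
    return 1 - 2 * area
```

For a = 0.5 the sum approaches 4/3 and the mean is 2, so the area is 1/3 and the coefficient is 1/3, in agreement with a/(1 + a).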


Derivation from the Income Distribution²


The second definition of the Gini Coefficient, Gini's
own, is based directly on the distribution of income,
F(y). In words, it is half the ratio of the average absolute
difference between pairs of observations to the
mean, $\mu$. In symbols, let $\Delta = E|x - y|$. Then $G = \Delta/2\mu$.
Now

\[ |x - y| = 2 \left( \frac{x + y}{2} - \min(x, y) \right) \]

so

\[ \Delta = 2(\mu - E \min(x, y)). \]


² The theorem in this section and its proof were contributed
by the referee, Joseph L. Gastwirth.

Further,

\[
\mathrm{Prob}(\min(x, y) \le z) = 1 - \mathrm{Prob}(x > z) \, \mathrm{Prob}(y > z) = 1 - (1 - F(z))^2,
\]


which is the cumulative distribution function of 
min(x,y). So 


\[
E \min(x, y) = \int_0^{y^*} z \, d\left( 1 - (1 - F(z))^2 \right) = -\int_0^{y^*} z \, d(1 - F(z))^2.
\]
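
Both facts used so far in this section, the relation between the mean absolute difference and E min(x, y) and the distribution function of min(x, y), can be confirmed exactly on a small sample. The sketch below is illustrative and not from the original note: the function name `check_min_identities` is hypothetical, x and y are taken to be two independent draws from the listed incomes, and `fractions.Fraction` is used so that the comparisons are exact.

```python
from fractions import Fraction


def check_min_identities(incomes):
    """True if both identities hold exactly for two independent draws from `incomes`."""
    xs = [Fraction(v) for v in incomes]
    n = len(xs)
    pairs = [(x, y) for x in xs for y in xs]  # all n * n equally likely ordered pairs
    mu = sum(xs) / n
    mean_abs_diff = sum(abs(x - y) for x, y in pairs) / len(pairs)
    mean_min = sum(min(x, y) for x, y in pairs) / len(pairs)
    # First identity: E|x - y| = 2 * (mu - E min(x, y)).
    first = mean_abs_diff == 2 * (mu - mean_min)

    def F(z):
        # Empirical distribution function: share of incomes not exceeding z.
        return Fraction(sum(1 for x in xs if x <= z), n)

    def cdf_min(z):
        # Share of pairs whose smaller member does not exceed z.
        return Fraction(sum(1 for x, y in pairs if min(x, y) <= z), len(pairs))

    # Second identity: Prob(min(x, y) <= z) = 1 - (1 - F(z))^2, checked at every income level.
    second = all(cdf_min(z) == 1 - (1 - F(z)) ** 2 for z in xs)
    return first and second
```

For the incomes 1 and 3, the mean absolute difference is 1 and E min(x, y) is 3/2, so 2(2 - 3/2) = 1 as required.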


We have already evaluated this integral in the course
of deriving equation (5). Thus

\[
G = \frac{\Delta}{2\mu} = 1 - \frac{1}{\mu} \int_0^{y^*} (1 - F(y))^2 \, dy,
\]

as before, since the term $y^*(1 - F(y^*))^2$ vanishes
whether y* is finite or not.